LLM Interpretation of Topic Models

Author

Doron Feingold

Published

September 8, 2025

Closeup of the Sovereign Thrones

Overview

This document uses a Large Language Model (LLM) to interpret the topics generated by our LDA models in the previous step. The ‘topics’ the models produce are collections of statistically related words. While these may be meaningful to a subject-matter expert, manual interpretation is laborious and inherently subjective. We therefore chose to use an LLM to automate this step, prioritizing speed and reproducibility.

For this task, we selected Anthropic’s Claude model, which has a reputation for nuanced cultural and historical interpretation. We’ll use its API to interpret the topics from each of our models (one per value of k), providing it with the 25 highest-probability words (ranked by their beta values) for each topic.

Setup

First, we load the necessary libraries and data.

# Seed
set.seed(1867)

# Load libraries
library(dplyr)
library(tidytext)
library(tidyr)
library(stringr)
library(ggplot2)
library(topicmodels)
library(purrr)
library(httr2)
library(jsonlite)
library(readr)
library(kableExtra)

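# Load project-specific helper functions.
# On failure, print the error details, then re-raise so the script
# does not continue without the helpers it depends on.
tryCatch(
  {
    source("R/functions.R")
  },
  error = function(e) {
    cat("Detailed error:\n")
    cat("Message:", e$message, "\n")
    cat("Call:", deparse(e$call), "\n")
    stop(e)
  }
)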

# Load trained models; k_values must list k in the same order as the models
lda_models <- readRDS("output/models/lda_models.rds")
k_values <- c(4, 8)

# Set up API Key
ANTHROPIC_API_KEY <- Sys.getenv("ANTHROPIC_API_KEY")

Prompt Engineering for Interpretation

We create a new function interpret_topic_set_with_llm that formats the top words from all topics in a model into a single prompt. The prompt instructs the LLM to return a JSON array, with one object for each topic’s interpretation.

Prompt:

You are an expert political historian specializing in Canadian policy. I have a set of {k_value} topics from a topic model of Canadian Throne Speeches. Your task is to provide a concise interpretation for EACH topic based on its most probable words, considering the context of the other topics.

Here is the full set of topics and their words: {topics_words}

Please provide your response as a valid JSON array (and ONLY the JSON array, no markdown formatting or extra text).

Each element should be an object with these three keys:
1. ‘topic_id’: The integer topic number.
2. ‘label’: A short, descriptive topic label of 3-5 words (e.g., ‘National Defense & Foreign Affairs’).
3. ‘focus’: A single sentence describing the policy area this topic represents.

Return only the JSON array, starting with [ and ending with ]:
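The body of interpret_topic_set_with_llm is not shown in this document, so the following is only a minimal sketch of how the two placeholders in the prompt might be filled. The helper names format_topic_words and build_interpretation_prompt are hypothetical, the template is abbreviated, and the input is assumed to be a data frame with topic and terms columns like the one built in the loop further down.

```r
# Hypothetical helper: render the topics table as the {topics_words} block,
# one "Topic <id>: <terms>" line per topic.
format_topic_words <- function(topics_for_model) {
  paste0(
    "Topic ", topics_for_model$topic, ": ", topics_for_model$terms,
    collapse = "\n"
  )
}

# Hypothetical helper: substitute both placeholders into the template.
# The template is abbreviated here; the full wording is the prompt shown above.
build_interpretation_prompt <- function(k_value, topics_for_model) {
  template <- paste(
    "I have a set of {k_value} topics from a topic model of Canadian Throne Speeches.",
    "Here is the full set of topics and their words:",
    "{topics_words}",
    "Return only the JSON array, starting with [ and ending with ]:",
    sep = "\n"
  )
  # fixed = TRUE treats the braces literally rather than as regex syntax
  prompt <- gsub("{k_value}", as.character(k_value), template, fixed = TRUE)
  gsub("{topics_words}", format_topic_words(topics_for_model), prompt, fixed = TRUE)
}

# Usage with a toy two-topic input (made-up terms, not real model output)
toy_topics <- data.frame(topic = 1:2, terms = c("war, force", "health, care"))
cat(build_interpretation_prompt(2, toy_topics))
```

Sending every topic in one prompt, rather than one request per topic, is what lets the model weigh each topic against the others, as the prompt asks.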

Iteratively Interpret Each Topic Set

Now we’ll loop through each model, call interpret_topic_set_with_llm, and parse the JSON array it returns.

Note: To reproduce this part you need an Anthropic API key. API access is not free, but a $5 credit (the minimum purchase) should be enough: running this loop several times during testing cost about $0.15. Adapting the function to another LLM provider should not be hard if cost is an issue.
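For readers adapting the call to another provider, here is a rough sketch of what the HTTP request behind such a function could look like with httr2. It is not the project's actual implementation: build_anthropic_request and call_llm are hypothetical names, the model identifier is left as a caller-supplied argument because this document does not state which Claude model was used, and the max_tokens default is an arbitrary assumption. The return shape (a list with an error element on failure, otherwise the parsed response body) is chosen to match how the loop below reads result$error and result$content[[1]]$text.

```r
library(httr2)

# Hypothetical helper: build (but do not send) a Messages API request.
build_anthropic_request <- function(prompt, api_key, model, max_tokens = 2000) {
  req <- request("https://api.anthropic.com/v1/messages")
  req <- req_headers(
    req,
    "x-api-key" = api_key,
    "anthropic-version" = "2023-06-01"
  )
  req <- req_body_json(req, list(
    model = model,            # caller-supplied; not specified in this document
    max_tokens = max_tokens,  # assumed default
    messages = list(list(role = "user", content = prompt))
  ))
  # Do not throw on HTTP errors; they are converted to list(error = ...) below
  req_error(req, is_error = function(resp) FALSE)
}

# Hypothetical helper: send the request and normalise failures.
call_llm <- function(prompt, api_key, model) {
  tryCatch(
    {
      resp <- req_perform(build_anthropic_request(prompt, api_key, model))
      body <- resp_body_json(resp)
      if (resp_status(resp) >= 400) {
        list(error = paste("HTTP", resp_status(resp), body$error$message))
      } else {
        body
      }
    },
    # Network failures or non-JSON bodies also end up as an error entry
    error = function(e) list(error = conditionMessage(e))
  )
}
```

Keeping request construction separate from sending makes the provider-specific part (URL, headers, body layout) easy to swap out without touching the error handling.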

# This code iterates through each MODEL
llm_interpretations <- purrr::map2_dfr(
  lda_models,
  k_values,
  function(model, k) {
    # Get the top 25 terms for all topics in THIS model
    topics_for_model <- tidy(model, matrix = "beta") %>%
      group_by(topic) %>%
      slice_max(beta, n = 25) %>%
      summarise(terms = paste(term, collapse = ", "), .groups = "drop")

    result <- interpret_topic_set_with_llm(
      k,
      topics_for_model,
      ANTHROPIC_API_KEY
    )

    if (!is.null(result$error)) {
      log_message(
        paste("API ERROR for k=", k, ":", result$error),
        "llm_interpretation_log.txt"
      )
      return(tibble(
        k = k,
        topic = NA,
        label = "API_ERROR",
        focus = result$error
      ))
    }

    json_text <- result$content[[1]]$text

    log_message(
      paste("Raw response for k=", k, ":", substr(json_text, 1, 200), "..."),
      "llm_interpretation_log.txt"
    )

    clean_json <- clean_json_response(json_text)

    parsed_json <- tryCatch(
      {
        fromJSON(clean_json, flatten = TRUE)
      },
      error = function(e) {
        log_message(
          paste(
            "JSON parsing error for k=",
            k,
            ":",
            e$message,
            "\nCleaned JSON:",
            clean_json
          ),
          "llm_interpretation_log.txt"
        )
        return(NULL)
      }
    )

    if (is.null(parsed_json)) {
      return(tibble(
        k = k,
        topic = NA,
        label = "JSON_PARSE_ERROR",
        focus = "Could not parse JSON response"
      ))
    }

    # Return a tidy data frame, adding the k-value and topic words
    parsed_json %>%
      as_tibble() %>%
      rename(topic = topic_id) %>%
      mutate(k = k) %>%
      left_join(topics_for_model, by = "topic")
  }
)

# Save the resulting interpretations
saveRDS(llm_interpretations, "output/models/llm_topic_labels.rds")
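
The loop above relies on clean_json_response, which is loaded from the project's helper file and is not shown here. As an illustration only, a cleaner of this kind might strip any markdown fences the model adds despite the instructions and keep just the outermost JSON array. The names clean_json_response_sketch and validate_interpretations are hypothetical, and the validation step is an extra check that the original loop does not perform.

```r
library(jsonlite)

# Hypothetical stand-in for clean_json_response().
clean_json_response_sketch <- function(text) {
  fence <- strrep("`", 3)
  # Drop markdown code fences, with or without a language tag
  text <- gsub(paste0(fence, "[A-Za-z]*"), "", text)
  # Keep only the span from the first "[" to the last "]"
  start <- as.integer(regexpr("[", text, fixed = TRUE))
  ends <- as.integer(gregexpr("]", text, fixed = TRUE)[[1]])
  end <- ends[length(ends)]
  if (start == -1 || end == -1 || end < start) {
    return(trimws(text))  # nothing array-like found; let the parser report it
  }
  substr(text, start, end)
}

# Hypothetical check: one row per topic, with the three requested keys.
validate_interpretations <- function(parsed, k) {
  is.data.frame(parsed) &&
    all(c("topic_id", "label", "focus") %in% names(parsed)) &&
    setequal(parsed$topic_id, seq_len(k))
}

# Usage with a made-up response wrapped in extra text and a fence
raw <- paste0(
  "Here you go:\n", strrep("`", 3), "json\n",
  "[{\"topic_id\": 1, \"label\": \"A\", \"focus\": \"B\"}]\n",
  strrep("`", 3)
)
parsed <- fromJSON(clean_json_response_sketch(raw))
validate_interpretations(parsed, k = 1)
```

A check like this would catch a response that parses as JSON but skips or duplicates a topic, which the left join in the loop would otherwise pass through silently.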

Detailed Topic Interpretations

Interpretations for k = 4

LLM Interpretations for 4 Topics
Topic | LLM Label | LLM Focus | Top 25 Words Sent to LLM
1 | Parliamentary Procedure & Administration | This topic covers the formal parliamentary processes, legislative submissions, and administrative matters including railways, public accounts, and British colonial governance structures. | submit, lay, last, railway, subject, public, trade, present, state, unite, session, may, parliament, attention, law, west, estimate, british, past, upon, consideration, measure, relate, account, service
2 | Social Policy & Community Welfare | This topic encompasses domestic social programs focused on health care, family support, job creation, and community building including Aboriginal affairs. | community, health, world, build, support, help, good, economy, family, child, job, care, opportunity, economic, can, create, strong, system, first, investment, action, strengthen, plan, time, aboriginal
3 | International Affairs & Defense | This topic addresses Canada’s international relations, military affairs, trade agreements, and participation in global conferences and diplomatic initiatives. | minister, war, house, unite, nation, force, trade, legislation, international, parliament, last, agreement, service, development, provision, consider, amendment, member, common, submit, world, continue, session, conference, measure
4 | Economic Development & Growth | This topic focuses on federal economic policy, including resource development, market growth, provincial-federal cooperation, and programs to stimulate economic development. | economic, development, program, federal, opportunity, legislation, improve, social, parliament, economy, world, international, can, minister, service, change, introduce, resource, increase, growth, public, encourage, market, provincial, good

Interpretations for k = 8

LLM Interpretations for 8 Topics
Topic | LLM Label | LLM Focus | Top 25 Words Sent to LLM
1 | Parliamentary Business & Trade | This topic represents formal parliamentary procedures, legislative sessions, and trade relations with emphasis on British Commonwealth connections and railway development. | last, submit, may, trade, public, state, session, unite, railway, attention, parliament, conference, present, result, subject, early, british, lay, country, upon, house, time, product, respect, regard
2 | Social Programs & Healthcare | This topic focuses on healthcare systems, community development, family support, and Aboriginal affairs within the broader context of economic and social investment. | health, community, world, child, economy, opportunity, build, help, good, can, support, aboriginal, care, government, life, development, economic, strong, system, investment, need, improve, challenge, family, quality
3 | War & International Relations | This topic addresses wartime governance, military forces, international agreements, and peacekeeping efforts during periods of global conflict. | war, house, force, unite, nation, parliament, minister, provision, common, service, world, submit, member, trade, session, condition, agreement, last, present, peace, upon, may, consideration, necessary, international
4 | Jobs & Economic Policy | This topic centers on employment creation, business support, taxation, and economic security measures to strengthen industry and protect families. | job, support, family, good, world, economic, build, community, legislation, help, protect, plan, business, introduce, create, economy, tax, trade, action, security, system, future, industry, time, strengthen
5 | Infrastructure & Regional Development | This topic covers railway expansion, regional development in western and northern Canada, and public infrastructure projects requiring legislative consideration. | lay, railway, submit, last, subject, public, west, law, present, trade, relate, parliament, past, measure, unite, north, now, session, consideration, state, several, estimate, upon, service, report
6 | Federal-Provincial Coordination | This topic addresses federal-provincial relations, legislative amendments, international agreements, and coordinated program development across government levels. | development, minister, legislation, amendment, nation, unite, international, house, consider, propose, provincial, trade, programme, assistance, approve, federal, continue, measure, place, increase, economic, agreement, last, service, force
7 | Indigenous Affairs & Climate | This topic focuses on Indigenous communities, climate change action, pandemic response, and contemporary social challenges requiring government intervention. | community, build, support, help, economy, action, indigenous, good, include, first, change, job, create, health, world, care, pandemic, protect, family, time, safe, home, climate, invest, plan
8 | Economic Development Programs | This topic encompasses federal economic programs, social development initiatives, resource management, and legislative measures to promote growth and societal improvement. | economic, program, development, federal, social, opportunity, parliament, minister, improve, legislation, resource, change, can, society, service, world, economy, international, public, encourage, introduce, increase, one, growth, good

Conclusion

Now that we have interpretations for our topics, we are ready to analyze the models’ gamma values—the proportion of each topic within an individual speech. In the final step, we will visualize how these topic proportions change over time, mapping the thematic shifts in Canada’s political discourse.

Next Step: Topic Shift Analysis & Validation →